ABSTRACT

A detailed X-ray analysis of an XMM-Newton observation of the high-redshift
(z=0.89) galaxy cluster ClJ1226.9+3332 is presented. After careful
consideration of background subtraction issues, the X-ray temperature is
found to be 11.5^{+1.1}_{-0.9} keV, the highest X-ray temperature of any
cluster at z > 0.6. The temperature is consistent with the observed velocity
dispersion. In contrast to MS1054-0321, the only other very hot cluster
currently known at z > 0.8, ClJ1226.9+3332 features a relaxed X-ray
morphology, and its high overall gas temperature is not caused by one or
several hot spots. The system thus constitutes a unique example of a
high-redshift (z > 0.8), high-temperature (T > 10 keV), relaxed cluster, for
which the usual hydrostatic equilibrium assumption, and hence the X-ray
mass, is most reliable.

A temperature profile is constructed (for the first time at this redshift) and
is consistent with the cluster being isothermal out to 45% of the virial radius.
Within the virial radius (corresponding to a measured overdensity of a factor
of 200), a total mass of (1.4 ± 0.5) × 10^15 M⊙ is derived, with a gas mass
fraction of 12 ± 5% (for a ΛCDM cosmology and H_0 = 70 km s^-1 Mpc^-1). This
total mass is similar to that of the Coma cluster. The bolometric X-ray
luminosity is 5.3^{+0.2}_{-0.2} × 10^45 erg s^-1. Analysis of a short Chandra
observation confirms the lack of significant point-source contamination, the
temperature, and the luminosity, albeit with lower precision. The
probabilities of finding a cluster of this mass within the volume of the
discovery X-ray survey are ~ 8 × 10^-5 for Ω_M = 1 and 0.64 for Ω_M = 0.3,
making Ω_M = 1 highly unlikely.

The entropy profile suggests that entropy evolution is being observed. The
metal abundance (Z = 0.33^{+0.14}_{-0.10} Z⊙), gas mass fraction, and gas
distribution are consistent with those of local clusters; thus the bulk of
the metals were in place by z=0.89.

Key words: cosmology: observations - galaxies: clusters: general - galaxies:
high-redshift - galaxies: clusters: individual: (ClJ1226.9+3332) - intergalactic
medium - X-rays: galaxies



1 INTRODUCTION 

Massive galaxy clusters form from the high-sigma tail of the initial
cosmological density distribution. As a result they are rare, but also very
powerful probes of cosmology. Given an assumed initial density distribution,
the properties of the massive cluster population can be predicted under
alternate cosmologies, and those predictions tested with observations. The
predictions of different cosmologies diverge with redshift, making
high-redshift, massive clusters the most useful objects to distinguish
between them.

X-ray observations of galaxy clusters provide a useful means of measuring
their properties. The intra-cluster gas is extremely luminous in X-rays, and
measurements of the gas temperature and density distributions allow the
total mass of the system (the property most directly related to
cosmological predictions) to be inferred. This inference requires, however,
that the intra-cluster medium (ICM) be in hydrostatic equilibrium. If, as
believed, clusters form through a series of hierarchical mergers, then this
will only be the case some time after the last merger. Thus the most useful
objects to study for the purpose of constraining cosmological models in this
way are high-redshift, massive, relaxed clusters of galaxies. These are
extremely rare.

However, the question of when a cluster can be considered relaxed is
something of a contentious issue. In a study of 368 low-z clusters observed
by Einstein, Jones & Forman (1999) found ≈ 40% to have substructure in their
X-ray images. This fraction is likely to be an underestimate, as Einstein
was unable to resolve small-scale substructure. More recently, the high
resolving power of Chandra has revealed substructure in clusters that were
previously considered to be relaxed, such as A1795 (Fabian et al. 2001;
Markevitch et al. 2001) and MS 1455.0+2232 (Mazzotta et al. 2002). On the
other hand, X-ray derived masses of clusters that appear relaxed in Chandra
observations have been found to agree well with independent weak lensing
mass measurements, at least in the inner regions (Allen et al. 2001, and
references therein).

While Chandra can accurately probe the gas prop- 
erties in the central regions of clusters, the strength of 
XMM-Newton lies in its large collecting area, which al- 
lows it to trace the gas density and temperature struc- 
ture out into the low surface-brightness emission at large 
radii, even at high redshifts. This minimises the uncer- 
tainties involved in extrapolating these properties out 
to the virial radius when deriving the total mass of the 
system. The mass composition of massive galaxy clus- 
ters (e.g. the baryonic to total mass fraction) is believed 
to be representative of the universe as a whole, due to 
their large size (e.g. Allen et al. 2002). Thus by directly 
observing the ICM out to large radii, one obtains a more 
representative measurement of these properties. 

In June 2001, XMM-Newton made a 30 ks observation of galaxy cluster
ClJ1226.9+3332, one of the most distant, luminous clusters found in the
WARPS X-ray selected survey (Scharf et al. 1997; Jones et al. 1998; Ebeling
et al. 2000; Perlman et al. 2002). The cluster was positioned ~ 4' off-axis
in order to investigate other candidate clusters in the field, to be
described in a future paper. The discovery ROSAT data indicated a high X-ray
luminosity, but were insufficient to accurately determine morphology or
temperature. Optical follow-up found the cluster's redshift to be 0.89
(Ebeling et al. 2001), corresponding to a look-back time of over half of the
age of the universe, and Sunyaev-Zel'dovich effect imaging confirmed that
the cluster is both hot and massive (Joy et al. 2001). We present here the
results of a detailed analysis of the XMM-Newton data. There is also a
fairly short (10 ks) archived Chandra observation of ClJ1226.9+3332, which
has been examined by Cagnoni et al. (2001). We have analysed these data in
a way consistent with our XMM-Newton analysis in order to check
consistency, and we draw comparisons at several relevant points.

Figure 1. Lightcurve of the pn observation of ClJ1226.9+3332, in 50 s bins
in the range 10–15 keV. The bar below the lightcurve indicates the good
time intervals left after cleaning, and the dashed line indicates the 3σ
cut-off level (see text).

Throughout this paper, a cosmology of H_0 = 70 km s^-1 Mpc^-1 and
Ω_M = 0.3 (Ω_Λ = 0.7) is adopted, unless stated otherwise, and all errors
are quoted at the 68% level. At the cluster's redshift, 1" corresponds to
7.8 kpc in this cosmology. The virial radius (r_200) is defined as the
radius within which the mean density is 200 times the critical density at
the redshift of observation.
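
The quoted angular scale can be checked with a short numerical integration.
The sketch below assumes a flat ΛCDM model with the parameters adopted here
and neglects radiation; the function names are illustrative and are not
taken from the analysis itself.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def e_z(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble parameter E(z) for flat LambdaCDM (no radiation)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def angular_diameter_distance_mpc(z, h0=70.0, n=1000):
    """Angular diameter distance in Mpc, using Simpson's rule (n must be even)."""
    step = z / n
    total = 1.0 / e_z(0.0) + 1.0 / e_z(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) / e_z(i * step)
    comoving = (C_KM_S / h0) * total * step / 3.0
    return comoving / (1.0 + z)


def kpc_per_arcsec(z):
    """Proper transverse size in kpc subtended by one arcsecond at redshift z."""
    arcsec_in_rad = math.pi / (180.0 * 3600.0)
    return angular_diameter_distance_mpc(z) * 1000.0 * arcsec_in_rad


scale = kpc_per_arcsec(0.89)
print(f"1 arcsec = {scale:.2f} kpc at z = 0.89")  # roughly 7.8 kpc
```

Under these assumptions the scale comes out close to the 7.8 kpc per
arcsecond quoted above.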



2 DATA PREPARATION 

The data from the PN and two MOS detectors were processed with the
processing chains epchain (PN) and emchain (MOS), as these have been found
to be significantly better at removing bad events and pixels than the
standard 'procs' (epproc and emproc). Examination of the processed PN
events showed that a few bad pixels (two rows and one pixel) were
undetected by the chain; these were added to the bad pixel tables, and the
data were reprocessed. Lightcurves of the three detectors, produced in the
10–15 keV band, showed that the observation was contaminated by several
large background flares. The periods of very high background were selected
by eye and removed from the lightcurve, before the remaining data were
cleaned by a recursive 3σ clipping algorithm to leave a stable mean rate.
The lightcurve of the PN detector is shown in Fig. 1, with the accepted
times indicated by the bar underneath the lightcurve.
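
A recursive clipping step of this kind can be sketched as follows. This is
a minimal illustration, not the routine used for the analysis: it assumes
that only bins above the mean are rejected (flares raise the rate) and that
the scatter is estimated from the surviving bins at each pass. The function
name and the toy lightcurve are hypothetical.

```python
import statistics


def sigma_clip(rates, n_sigma=3.0, max_iter=50):
    """Iteratively reject bins more than n_sigma above the mean rate.

    Returns a list of booleans marking the accepted bins.
    """
    keep = [True] * len(rates)
    for _ in range(max_iter):
        good = [r for r, k in zip(rates, keep) if k]
        mean = statistics.mean(good)
        sigma = statistics.pstdev(good)
        limit = mean + n_sigma * sigma
        new_keep = [k and r <= limit for r, k in zip(rates, keep)]
        if new_keep == keep:  # converged: the mean rate is now stable
            break
        keep = new_keep
    return keep


# Toy lightcurve: twenty quiescent bins followed by one flaring bin.
rates = [1.0, 1.1, 0.9, 1.05, 0.95] * 4 + [8.0]
accepted = sigma_clip(rates)
print(accepted.count(False), "bin(s) rejected")
```

The accepted bins define the good time intervals; in practice the mask
would be converted to start and stop times before filtering the event list.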

Events were filtered on the basis of their pattern pa- 
rameter, which indicates the geometry of the detection 
of each event, i.e. the number of adjacent pixels that de- 
tect each photon. Events whose patterns are considered 
well calibrated (PN - single and double, MOS - single, 
double, and quadruple) were retained in the filtering. 

In the analysis of XMM-Newton data, one must carefully account for the
background contamination. There are, broadly speaking, two ways of doing
this: one may sample the background locally, from the same observation as
the source, or one may use a 'blank-sky' background dataset, composed of
many observations, with all bright sources removed (Lumb et al. 2002).

The background is composed of three types of 
events: 

• Soft protons - this component is believed to be caused by solar flares,
and its intensity and spectrum vary significantly with time. It is the
dominant component during flaring intervals, but in quiescent periods its
contribution is the smallest. This component is vignetted, but may not have
the same vignetting function as the X-rays.

• X-rays - this component dominates the background 
at low energies (< 1.5 keV), and varies spatially across 
the sky (though not significantly across the field-of- 
view). This component is vignetted by the telescopes. 

• Cosmic-ray induced particles - this component dominates at high energies,
and is induced by high-energy cosmic rays passing unvignetted through the
instrument. This component is referred to hereafter as the particle
background.

Figure 2. Contours of X-ray emission detected by XMM-Newton (0.3–8 keV)
overlaid on a Subaru I-band image of cluster ClJ1226.9+3332. The contours
were derived from the combined data of the three cameras, adaptively
smoothed so that all features were significant at the 99% level. The
contours are logarithmically spaced above the lowest contour at
0.45 counts pixel^-1.



This observation of ClJ1226.9+3332 appears to be contaminated by a
particularly high background level, even after lightcurve cleaning. As
shown in Fig. 1, there are two intervals of lower background separated by a
large flaring event. During the first interval (2–11 ks), the average PN
count rate (10–15 keV) was 1.81 counts s^-1, while in the second
(18–26 ks) the mean rate was 0.97 counts s^-1. For comparison, the PN count
rate in the blank-sky datasets in this energy band was 0.52 counts s^-1.
Even considering the 10–20% variations in the background level found by
Lumb et al. (2002), the background in these two periods is higher than
would normally be acceptable. The count rates outside the field of view,
which consist only of particle events, were also compared. The count rate
was a factor of 1.7 higher in the ClJ1226.9+3332 dataset than in the
blank-sky data. This shows that the high background level in this dataset
is due to high levels of both particles and soft protons. This
significantly increases the difficulty and uncertainties involved in using
a blank-sky background in the analysis.

Due to the high background, a careful comparison of spectral analysis
methods was made (using both local and blank-sky backgrounds), on both of
the time periods separately and combined. For simplicity, the first period
(2–11 ks) is hereafter referred to as the "high background period", and the
second (18–26 ks) as the "low background period". This analysis is
described in some detail in the following sections, but the general
conclusion was that in the low background period all methods gave
consistent results, and that if a local background was used, then the
results from the high background period, the low background period, and
both periods combined were consistent. As discussed in §4.6, our final
results are taken from the combined periods with a local background, which
contained a useful time of 14 ks for the PN detector and 18 ks for each of
the two MOS detectors.

The Chandra observation was performed with the 
ACIS-S array exposed, with the target on the S3 chip. 
Only standard data preparation was required, as there 
were no significant background flares during the short 
exposure. 



3 IMAGING ANALYSIS 

A combined, exposure-corrected image of the datasets of the PN and two MOS
cameras in the energy band 0.3–8 keV was produced and adaptively smoothed.
Contours of this smoothed emission are shown in Fig. 2, overlaid on an
optical image. The outer contours are reasonably circular, suggesting that
the X-ray emitting gas is fairly relaxed. The lowest contour, at a level of
0.45 counts pixel^-1 (which is 1.5 times the background level), is
distorted by the point source in the west, and truncated slightly along the
south-east edge by a PN CCD gap that was not fully removed by the exposure
correction.

For comparison, we also overlay contours produced in the same way from the
archived Chandra observation of ClJ1226.9+3332 on the same optical image in
Fig. 3. It is clear from Fig. 3 that there are no strong point sources
unresolved in the XMM-Newton observation.

3.1 Two-dimensional modelling of the X-ray 
emission 

A two-dimensional (2D) model of the X-ray emission was fit to the
XMM-Newton data, taking the different background components and
instrumental effects into account. The approach followed was to bin the
data into an image with 4.4" pixels, but apply no vignetting correction or
any further manipulation of the data. This pixel size was chosen to be an
integer multiple of the 1.1" pixels of the point spread function (PSF)
images produced by the SAS tool calview, allowing them to be re-binned to
the same scale, and to be large enough to reduce computing time in the
fitting procedure without losing resolution. An image of a dataset obtained
with the filter in the closed position (and thus blocking all X-rays) was
filtered in the same way as the source data, and normalised to it using the
ratio of outside field-of-view counts in the two sets, creating a 'particle
image'. This was smoothed with a Gaussian of σ = 20" to prevent the fitting
being biased by noise, while maintaining any larger scale spatial variation
of this background component. The particle image was then divided by an
exposure map, giving an anti-vignetted image of the particle component of
the background. The exposure map was also used to make a binary filter mask
to exclude the CCD gaps and bad pixels from the fit.

Figure 3. Contours of X-ray emission detected by Chandra (0.5–5 keV)
overlaid on the same Subaru I-band image as Fig. 2. The contours are taken
from an exposure-corrected image that was adaptively smoothed so that all
features were significant at the 99% level. The contours are
logarithmically spaced above the lowest contour at 0.03 counts pixel^-1.

In a background region of the data (on the same CCD where possible), a
model comprising the anti-vignetted particle image in that region plus a
flat component to represent the X-ray background was multiplied by the
exposure map, convolved with the PSF, and fit to the data, with both
background component normalisations free to vary. This meant that, in
effect, the background was fit with a flat particle background and a
vignetted X-ray plus soft-proton background. We note that the best-fitting
normalisation of the particle background varied by less than 5% from its
initial value. This indicates that the normalisation to the outside
field-of-view events was accurate, and therefore the vignetted-background
level found here should also be accurate. These background normalisations
were then fixed, and the source was modelled with the anti-vignetted
particle image in the source region, plus a flat X-ray background, plus a
2D β-profile¹, all multiplied by the exposure map and convolved with the
PSF.
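
As an illustration of the source component, the sketch below evaluates an
elliptical β-model surface brightness at a given image position. It assumes
the standard form S = S_0 [1 + (r/r_c)^2]^(-3β + 1/2) and one common
convention for the elliptical radius; the function and argument names are
hypothetical, and the PSF convolution and exposure-map multiplication
described above are not included.

```python
import math


def beta_model_2d(x, y, x0, y0, s0, r_core, beta, ellip=0.0, theta=0.0):
    """Surface brightness of an elliptical beta-model at pixel (x, y).

    ellip is the ellipticity (0 = circular) and theta the rotation angle of
    the major axis in radians. Units of x, y and r_core must match.
    """
    dx, dy = x - x0, y - y0
    # Rotate into the frame aligned with the major axis.
    x_maj = dx * math.cos(theta) + dy * math.sin(theta)
    y_min = dy * math.cos(theta) - dx * math.sin(theta)
    # Stretch the minor-axis coordinate so that contours become ellipses.
    r_sq = x_maj ** 2 + (y_min / (1.0 - ellip)) ** 2
    return s0 * (1.0 + r_sq / r_core ** 2) ** (-3.0 * beta + 0.5)


# Central values of the best-fitting parameters quoted in the text,
# with an arbitrary central amplitude of 1.
on_major = beta_model_2d(14.5, 0.0, 0.0, 0.0, 1.0, 14.5, 0.66, ellip=0.14)
on_minor = beta_model_2d(0.0, 14.5, 0.0, 0.0, 1.0, 14.5, 0.66, ellip=0.14)
print(on_major, on_minor)
```

At one core radius along the major axis the brightness falls to
2^(-3β + 1/2) of its central value; along the minor axis it falls faster.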

This procedure was followed for the PN and MOS 
cameras, and then the fits were performed simultane- 
ously, with each of the three models using their appro- 
priate exposure map, fitted background levels, and PSF 
(images of the PSF of each telescope were generated at 
1.5 keV, corresponding to the peak effective area, and 
at an appropriate off-axis angle). The amplitudes of the 
models were independent, but they were constrained to 
fit to the same slope, core radius, central position, ellip- 
ticity, and rotation angle. The best-fitting model had a 
core radius r c = 14.5 jl ; g", a slope (3 = 0.66l Q2, and an 
ellipticity of 0.14 (while all parameters were free to vary 
in the error computation, errors were only computed on 
r_c and β because of the computational load involved).
The fitting was repeated with the PN and combined 
MOS data separately, and the best-fitting parameters 
were found to be consistent throughout. 



3.2 One-dimensional surface-brightness profile 

In order to measure the extent of the emission, and to investigate the
goodness-of-fit of the 2D model to the data, a one-dimensional (1D)
surface-brightness profile of the emission in an exposure-corrected,
combined image from the three XMM-Newton EPIC cameras was produced. Before
the exposure correction, the exposure maps were normalised to their value
at the cluster centroid, thereby maintaining, as much as possible, the
Poissonian statistics in an exposure-corrected image.

The profile was centred on the X-ray centroid
(α[2000.0] = 12h 26m 57.94s, δ[2000.0] = +33°32'46.2"),
and the circular radial bins were adaptively sized so that each contained a
detection with a signal-to-noise ratio of at least 3 (the background level
being estimated from a large concentric annulus - we note that this is
likely to be an overestimate of the background at the cluster centre, due
to the anti-vignetted particle background). The emission was detected out
to 100" (776 kpc) at the 3σ level.
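
The adaptive binning can be illustrated with the following sketch, which
merges thin annuli outwards until each merged bin reaches the required
signal-to-noise ratio. It assumes that the signal-to-noise ratio is the net
counts divided by the square root of the total counts, with the background
level treated as known; that definition and all names are assumptions made
for illustration.

```python
import math


def adaptive_radial_bins(counts, areas, bkg_per_area, min_snr=3.0):
    """Merge consecutive thin annuli until each bin reaches min_snr.

    counts and areas describe thin annuli ordered by increasing radius.
    Returns a list of (first_index, last_index, net_counts, snr) tuples.
    Trailing annuli that never reach min_snr are left out, which sets the
    radius out to which the emission is detected.
    """
    bins = []
    start, c_sum, a_sum = 0, 0.0, 0.0
    for i, (c, a) in enumerate(zip(counts, areas)):
        c_sum += c
        a_sum += a
        net = c_sum - bkg_per_area * a_sum
        snr = net / math.sqrt(c_sum) if c_sum > 0 else 0.0
        if snr >= min_snr:
            bins.append((start, i, net, snr))
            start, c_sum, a_sum = i + 1, 0.0, 0.0
    return bins


# Toy profile: six annuli of unit area with a background of 10 counts each.
profile = adaptive_radial_bins([100, 18, 18, 18, 11, 10], [1] * 6, 10.0)
print([(first, last) for first, last, _, _ in profile])
```

In this toy profile the bright central annulus forms a bin on its own, the
next three annuli are merged, and the outermost two are consistent with
background.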

The 2D analysis is superior to the 1D analysis, not least because we do not
account for the PSF in the 1D analysis. We can, however, test the goodness
of fit of the 2D model in the following way. A 1D profile of the
best-fitting 2D model convolved with the PSF was made, and compared to the
observed 1D profile. A 1D β-model (Cavaliere & Fusco-Femiano 1976) plus a
flat background was fit to both the profile of the data and the 2D model,
and the best-fitting parameters were in good agreement. In the fit to the
data, the reduced χ² was 1.09 for 53 degrees of freedom. The best-fitting
1D models to the data and 2D model are overlaid on a profile of the data in
Fig. 4. These comparisons indicate that the 2D model provides a good
description of the data.



¹ http://cxc.harvard.edu/ciao2.3/download/doc/sherpa_html_manual/refmodels.html

Figure 4. Adaptively binned XMM-Newton surface-brightness profile of
ClJ1226.9+3332, with each bin having a signal-to-noise ratio of at least
three. The lines show the best-fitting 1D models to the data (solid line)
and to the two-dimensional model (dashed line).



3.3 Chandra analysis and the central region 

The archived Chandra observation of ClJ1226.9+3332 was also subjected to a
similar 1D and 2D analysis, and the best-fitting model parameters were
consistent with those derived from the XMM-Newton data, but of lower
statistical precision.

Many relaxed clusters show excess emission above a β-model due to cool,
dense gas in central regions, previously referred to as a cooling flow
(Fabian 1994). The residuals of the XMM-Newton and Chandra data after
subtraction of the best-fitting 2D models were examined, and while both
showed a weak central excess, these features were not statistically
significant. Excluding the central regions (r < 5" for Chandra, r < 20" for
XMM-Newton, consistent with the PSF) in the profile fits also gave no
significant change to the best-fitting model parameters.



3.4 Hardness-ratio Mapping 

The temperature structure of the cluster was probed with hardness-ratio
mapping. The hardness ratio, HR, was defined as

$$ HR = \frac{H - A H_{bg}}{S - A S_{bg}} \qquad (1) $$

where H and S are the counts in the source region in the hard and soft
bands respectively, and the bg subscript indicates the counts found in a
background region. A is the ratio of the area of the source region to that
of the background region. Assuming that the errors on each pixel are
Poissonian and uncorrelated, the error on the hardness ratio is then given
by

$$ \sigma(HR) = HR \sqrt{\frac{H + A^2 H_{bg}}{(H - A H_{bg})^2} + \frac{S + A^2 S_{bg}}{(S - A S_{bg})^2}} \qquad (2) $$

Figure 5. Adaptively binned HR significance map of the XMM-Newton data. The
overlaid contours are the same as in Fig. 2. The dashed box contains
regions of significantly lower temperature than the mean, assuming no
variation in absorbing column.

A soft band of 0.3–1.1 keV and a hard band of 1.1–8 keV were chosen when
computing the ratios because these bands had similar numbers of net counts.
Images of the cluster emission produced in these hard and soft bands were
binned up adaptively, in order to maximise the signal to noise in each bin
while maintaining good resolution. The minimum number of
background-subtracted counts (0.3–8 keV) was set to 150 per bin, although a
few bins were allowed to fall below this threshold to improve the
resolution. The resultant images were then divided to give a hardness-ratio
map. A series of absorbed MeKaL spectra were simulated at different
temperatures (assuming a constant Galactic absorption of
1.38 × 10^20 cm^-2 (Dickey & Lockman 1990) and a fixed metallicity of
0.3 Z⊙), convolved with the appropriate instrument responses, and the
number of counts in the hard and soft bands were found. This enabled the
conversion between HR values and approximate temperatures.
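
A minimal sketch of the hardness ratio of equation (1) and its uncertainty
is given below. It assumes standard Poisson error propagation, in which the
variance of the net hard-band counts is H + A^2 H_bg (and similarly for the
soft band); the function name and the example counts are hypothetical.

```python
import math


def hardness_ratio(h, s, h_bg, s_bg, area_ratio):
    """Background-subtracted hardness ratio and its 1-sigma Poisson error.

    h, s       : hard- and soft-band counts in the source region
    h_bg, s_bg : hard- and soft-band counts in the background region
    area_ratio : source-region area divided by background-region area (A)
    """
    net_h = h - area_ratio * h_bg
    net_s = s - area_ratio * s_bg
    hr = net_h / net_s
    var_h = h + area_ratio ** 2 * h_bg  # variance of the net hard counts
    var_s = s + area_ratio ** 2 * s_bg  # variance of the net soft counts
    err = abs(hr) * math.sqrt(var_h / net_h ** 2 + var_s / net_s ** 2)
    return hr, err


hr, err = hardness_ratio(h=120, s=110, h_bg=400, s_bg=300, area_ratio=0.1)
print(f"HR = {hr:.2f} +/- {err:.2f}")
```

Because the background region is much larger than a map bin, A is small and
the background terms contribute little to the variance.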

Fig. 5 is an image of the differences between the HR in each bin and the HR
corresponding to the global spectrally-measured temperature (11.5 keV; see
§5), divided by the errors on both the local HR and the HR of the global
temperature added in quadrature. Pixels where the broadband net counts were
< 50 are excluded, and the remaining pixels have an average of 140 counts.
This significance map shows that, within the limits of the data, the
emission is generally isothermal; 66% of the pixels are within 1σ of the HR
corresponding to the global temperature, and 95% are < 2σ from this HR.
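
The significance in each bin, and the fractions expected for purely
Gaussian scatter, can be sketched as follows. The helper names and the
example values are hypothetical; the Gaussian fractions are a standard
result and are shown only for comparison with the 66% and 95% quoted above.

```python
import math


def hr_significance(hr_local, err_local, hr_global, err_global):
    """Deviation of a local HR from the global HR, in units of sigma."""
    return (hr_local - hr_global) / math.hypot(err_local, err_global)


def fraction_within(significances, n_sigma):
    """Fraction of bins whose deviation is within n_sigma of zero."""
    inside = sum(1 for s in significances if abs(s) <= n_sigma)
    return inside / len(significances)


def gaussian_fraction(n_sigma):
    """Expected fraction within n_sigma for Gaussian-distributed deviations."""
    return math.erf(n_sigma / math.sqrt(2.0))


example = hr_significance(0.8, 0.3, 1.0, 0.4)  # a bin softer than the mean
print(f"{example:.2f} sigma")
print(f"{gaussian_fraction(1):.3f} {gaussian_fraction(2):.3f}")
```

An isothermal cluster with well-estimated errors should give fractions
close to the Gaussian values of about 68% and 95%.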

A region of significantly cooler emission to the west of the cluster centre
is marked with a white, dashed box in Fig. 5. The two darkest pixels here
are just over 3σ softer than the global temperature HR (note that we would
expect only 0.12 pixels to be > 3σ from the mean if they were randomly
distributed). Spectra were extracted within this region, and fit with an
absorbed MeKaL model. The best-fitting temperature
was 6.5 keV with the abundance frozen at 0.3 Z⊙,
or 5.9tJ.9 keV with a poorly constrained abundance of 
0.8 Z⊙. The reduced χ² in both cases was 1.3 for 29 (or 28) degrees of
freedom, suggesting that, though the statistical errors are large, a simple
MeKaL model is not a good description of the emission from this region (it
is excluded at the 88% level). The Chandra data indicate that point source
contamination is unlikely; a more likely cause is multi-temperature gas in
this region. A plausible explanation is that we are observing the in-fall
of some cooler (< 6 keV) body, whose emission is mixed with that from the
hotter gas along the line of sight.



4 SPECTRAL ANALYSIS METHODS 

When performing spectral analysis, one must be par- 
ticularly careful to treat the background components 
correctly, as failing to do so can strongly influence the 
results. We have performed a thorough investigation of 
different methods of treating the background spectra, 
which are in general extracted in one of two ways. One 
may extract a local background spectrum from a large 
region of the same CCD as the source emission. This 
method has the disadvantage that instrumental features 
in the background spectrum and the response of the de- 
tectors vary across the CCD, and the background spec- 
trum will be more severely vignetted than the source 
spectrum, tending to come from further off-axis. 

Alternatively, one may use a background spectrum 
extracted from a blank-sky dataset, which is a combi- 
nation of several observations with all bright sources 
removed. This method has the advantage that the spec- 
trum can be extracted from the same detector region 
as the source spectrum, and that the effective exposure 
time of the blank-sky dataset can be many times longer 
than that of the observation, reducing the Poissonian 
errors on the background spectrum. The disadvantages 
of this method are that the blank-sky observations are 
taken at different times and pointings to the source data, 
and the background varies both directionally and tem- 
porally. In particular the soft X-ray background varies 
directionally due to absorption in the galaxy, and emis- 
sion from the local bubble, while the shape and ampli- 
tude of the non-X-ray background spectrum varies tem- 
porally due to soft-proton flaring events, and variations 
in the particle flux. 

In principle, one can compensate for the vignetting of the telescope by
weighting each event (using SAS 5.3's evigweight). The weight is derived
from the ratio of the effective area at the position and energy of each
event, to the effective area at that energy on-axis. This method is
described in detail by, for example, Arnaud et al. (2002). The disadvantage
of applying this weighting is that non-vignetted particle induced events
are also weighted, artificially boosting their contribution. This effect
can be avoided when using a blank-sky background because the source and
background spectra are extracted from the same detector region. Providing
that the particle contribution is the same in the source and blank-sky
datasets, then the particle weighting effect will be the same, and its
effect will cancel when the spectra are subtracted. The spectra produced
from these weighted datasets should resemble the spectra one would detect
with a flat detector, so the on-axis Ancillary Response File is used when
performing the spectral fitting.
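
The effect of the weighting on the two kinds of event can be illustrated
with a toy calculation. The sketch assumes that each event is weighted by
the on-axis effective area divided by the effective area at its position,
ignores the energy dependence, and uses a purely hypothetical linear
vignetting function that is not a calibration of any instrument.

```python
def vignetting(offaxis_arcmin):
    """Toy vignetting factor relative to on-axis (hypothetical fall-off)."""
    return max(0.1, 1.0 - 0.03 * offaxis_arcmin)


def event_weight(offaxis_arcmin):
    """Weight = on-axis effective area / effective area at the event position."""
    return 1.0 / vignetting(offaxis_arcmin)


def weighted_rates(xray_flux, particle_rate, offaxis_arcmin):
    """Return the weighted X-ray and particle rates at a given off-axis angle.

    X-rays are vignetted by the telescope before detection; particle
    events are not, so the weight over-corrects them.
    """
    weight = event_weight(offaxis_arcmin)
    detected_xray = xray_flux * vignetting(offaxis_arcmin)
    return detected_xray * weight, particle_rate * weight


for angle in (4.0, 12.0):
    xray, particle = weighted_rates(1.0, 0.5, angle)
    print(f"{angle:4.1f} arcmin: X-ray {xray:.2f}, particle {particle:.2f}")
```

The weighted X-ray rate recovers its on-axis value at any angle, whereas
the particle rate is boosted by an amount that grows with off-axis angle,
which is why a background region further off-axis than the source is
affected more strongly.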

Thus four spectral background methods were investigated: local and
blank-sky backgrounds, with and without weighting. These methods were
applied to the low background and high background periods (see §2), and
both periods combined. In each case, the source spectrum was extracted from
a circle of radius 100" centred on the cluster centroid. We note that this
region crosses a PN CCD gap, but the responses of the PN CCDs are
identical, and do not vary strongly across the chip, so this should not
present a significant source of uncertainty. The spectra were all fit with
an absorbed MeKaL model, in the range 0.3–8 keV, with abundances fixed at
0.3 Z⊙ and the absorbing column density fixed at the Galactic value of
1.38 × 10^20 cm^-2 (Dickey & Lockman 1990).

4.1 The Low Background Period 

All of the background methods described below gave 
temperatures consistent with 11.5±2 keV. We take this 
as a reliable measurement of the temperature, free of 
systematic errors, but now check the results when, in 
addition, the high background period is included. 

4.2 Local background, no weighting 

This method is the most straightforward and, given the high background
level in this dataset, is likely to be the most reliable. A background
spectrum was extracted from a large region of the same CCD, at ≈ 250" from
the cluster centre. This was far enough to avoid contaminating emission,
but as close as possible to reduce the difference in effective area between
the source and background regions. The best-fitting temperature was
T = 11.56 ± 1.26 keV with a reduced χ²/dof = 0.93/298. We also investigated
the dependence of the result on the background region chosen, by using two
other background regions, and the best-fitting temperatures were all
consistent within their 1σ errors.

4.3 Local background, with weighting 

This method is similar to the preceding one, except that the spectrum is
produced from weighted events, as described above. This method should
reduce the discrepancy between the effective area at the source region and
the background region. However, the contribution of particle induced events
will be incorrectly boosted, and will be boosted more strongly in the
background region, which is further off-axis.

We extracted weighted source and background spectra from the same regions
used above. The best-fitting temperature was T = 4.44 ± 0.49 keV (reduced
χ²/dof = 1.15/340), significantly lower than that found with the
non-weighted spectrum above. This suggests that the anti-vignetting of the
particle events, combined with their high level, has a strong effect in
these data.

4.4 Blank-sky background, no weighting

This method uses a spectral background taken from the 
same detector region as the source spectrum, in a blank- 
sky dataset. The high background level in the source 
dataset means that its spectrum may be quite differ- 
ent from that of the blank-sky dataset. We attempt 
to account for this with the following method, based 
closely on that used by Arnaud et al. (2002). Briefly, 
the blank-sky background was scaled to the data using 
the ratio of the count rates in the whole field of view 
in the 12–14 keV band. A spectrum obtained from a
background region of the data was subtracted from a 
corresponding blank-sky spectrum to produce a 'resid- 
ual spectrum'. This was subtracted from the blank-sky 
spectra to account for systematic residuals between the 
data and the blank-sky spectra. Generally, the residual 
spectrum is taken from further off-axis than the source, 
so will be more strongly vignetted. This means that any 
soft X-ray excess (or decrement) in the source data will 
be underestimated (or overestimated) to some extent. 
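
The scaling and residual correction described above can be written
schematically as below. This is a sketch of the bookkeeping only, following
the sign convention of the text (residual = scaled blank-sky minus data in
the background region); spectra are represented as plain lists of count
rates per channel, and the function name, arguments, and optional
area_ratio factor are assumptions made for illustration. Differences in
vignetting between the two regions are ignored.

```python
def blank_sky_subtract(src, data_bkg_region, blank_src, blank_bkg_region,
                       rate_data_hi, rate_blank_hi, area_ratio=1.0):
    """Net source spectrum using a rescaled, residual-corrected blank-sky
    background.

    rate_data_hi / rate_blank_hi : full-field count rates in the hard band
        used for scaling (12-14 keV in the text).
    area_ratio : source-region area over background-region area.
    """
    scale = rate_data_hi / rate_blank_hi
    blank_src_scaled = [scale * b for b in blank_src]
    blank_bkg_scaled = [scale * b for b in blank_bkg_region]
    # Residual between blank-sky and observed background, per channel.
    residual = [b - d for b, d in zip(blank_bkg_scaled, data_bkg_region)]
    # Correct the blank-sky background at the source position.
    background = [b - area_ratio * r
                  for b, r in zip(blank_src_scaled, residual)]
    return [s - b for s, b in zip(src, background)]


net = blank_sky_subtract(src=[10.0, 8.0], data_bkg_region=[3.0, 2.0],
                         blank_src=[1.0, 1.0], blank_bkg_region=[1.0, 1.0],
                         rate_data_hi=2.0, rate_blank_hi=1.0)
print(net)
```

In the first channel of this toy example the observed background exceeds
the scaled blank-sky level, so the residual correction raises the
background that is subtracted from the source spectrum.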

Spectra produced with this method were fit as before, giving a temperature
of T = 11.48 ± 1.45 keV (reduced χ²/dof = 0.96/298), in excellent
agreement with the temperature found with the non-weighted local background
method above (T = 11.56 ± 1.26 keV).

4.5 Blank-sky background, with weighting

The problem of the vignetting of the residual spectrum in the preceding
method can be solved, in theory, by applying the weightings defined above
to the source and blank-sky datasets, before following the method described
in the preceding section. Again, the particle induced events will be
artificially boosted by the weighting, but in this case, as the source and
background spectra are extracted from the same regions, the boosting factor
should be the same, and it will cancel, providing that the particle event
levels in the source and blank-sky sets are similar. Weighted spectra were
produced, following the method above, and the best-fitting model had a
temperature of T = 7.79 ± 1.12 keV (reduced χ²/dof = 1.15/340). This is
not consistent with the temperature found by the two non-weighted methods,
suggesting again that the boosting of the particle events is a significant
effect.
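
The level of agreement between the four methods can be quantified roughly
by expressing each temperature difference in units of the combined 1σ
error. The sketch below does this relative to the non-weighted local
background result; it treats the errors as independent and Gaussian, which
is only approximate because the fits share the same source events. The
function name is illustrative.

```python
import math


def tension(t1, e1, t2, e2):
    """Difference between two measurements in units of their combined error."""
    return abs(t1 - t2) / math.hypot(e1, e2)


reference = (11.56, 1.26)  # local background, no weighting (keV)
others = {
    "local, weighted": (4.44, 0.49),
    "blank-sky, no weighting": (11.48, 1.45),
    "blank-sky, weighted": (7.79, 1.12),
}

for name, (temp, err) in others.items():
    print(f"{name}: {tension(*reference, temp, err):.2f} sigma")
```

On this estimate the two non-weighted methods agree to well within 1σ,
while the weighted local and weighted blank-sky results differ from the
reference by roughly 5σ and 2σ respectively.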

4.6 Spectral analysis - summary and 
conclusions 

When applied to the low background data, all spectral
analysis methods gave a temperature consistent with
11.5 keV, with 1σ errors of ≈ ±2 keV. We believe that
this consistency between the methods is due to the lower
particle background in this period. In both the high
background period and the combined periods, the results
were consistent with the low background period when
no weightings were used. We believe that the
inconsistencies that emerged when weighting methods were
used were due to the boosting of the higher particle levels
in these data. In a further test, the absorbing column was
allowed to vary, along with the temperature, in our
analysis of the combined period data. The Galactic value
at the position of ClJ1226.9+3332 is 1.38 × 10²⁰ cm⁻²
(Dickey & Lockman 1990); the best-fitting value with a
local background spectrum was (1.6 ± 0.7) × 10²⁰ cm⁻²
(T = 11.33 ± 1.55 keV), while with a blank-sky background
spectrum, the best fit was (5.0 ± 1.0) × 10²⁰ cm⁻²
(T = 9.05 ± 1.18 keV). This again shows the reliability of
the local background method. All further analysis was
performed on the combined period data, with a local
background, as this approach gives the best compromise
between limiting systematic and statistical sources of
uncertainty for these data. The non-weighted blank-sky
method was used as a consistency check.



5 SPECTRAL RESULTS 

The results of the fits to various combinations of the 
three XMM-Newton cameras are given in table 1. All 
quoted results were found using a local background with 
no weighting, though in each case, consistent results 
were found using a blank-sky background. All spectral 
fits for all combinations of cameras gave consistent re- 
sults. The spectra were fit in the 0.3 — 8 keV band, 
though we note that consistent results were also found 
when fitting in the 1 — 7 keV band. 

The simultaneous fit to the data from all three 
cameras was then investigated in more detail, with 
the abundance as a free parameter. The best-fitting
model was T = 11.5^{+1.1}_{-0.9} keV and Z = 0.33^{+0.14}_{-0.10} Z⊙
(reduced χ² = 1.07 for 502 degrees of freedom); this
abundance is well constrained for a high-redshift clus- 
ter, and is in good agreement with that found in local 
clusters (the blank-sky method gave an abundance of
Z = 0.37 Z⊙). Fig. 6 shows the best-fitting PN and
MOS spectra, produced using a local background. The 
spectra were grouped so that each bin contained a min- 
imum of 50 counts (PN) or 20 counts (MOS). 

The flux of ClJ1226.9+3332 measured by ROSAT
in the 0.5–2 keV passband was (3.4 ± 0.3) ×
10⁻¹³ erg s⁻¹ cm⁻². For comparison, the XMM-Newton
flux in this band was 3.7 × 10⁻¹³ erg s⁻¹ cm⁻² (after
extrapolation to r₂₀₀).

5.1 Temperature Profile 

A temperature profile was created by fitting spectra
extracted from annular bins centred on the X-ray centroid.
In order to minimise the effect of the PSF, while
maintaining a degree of spatial resolution, the annuli were
chosen so that their width (or diameter in the case of
the innermost bin) was > 15", which corresponds to
the 70% encircled energy radius of the PSF. Spectra
were fit as before in each of these annular bins, freezing
the abundance at 0.3 Z⊙ and the column density at the

Camera          T (keV)         Reduced χ²/dof

PN              11.56 ± 1.26    0.93/298
MOS1            12.42 ± 2.80    0.98/98
MOS2            11.93 ± 1.76    0.98/103
MOS1+MOS2       12.28 ± 1.89    0.99/203
PN+MOS1+MOS2    11.55 ± 0.86    1.00/503

Table 1. Summary of the results of spectral fits to different combinations of the XMM-Newton cameras with a local background
and no weighting. The temperatures quoted were derived from spectral fits with abundances frozen at 0.3 solar, and the absorbing
column frozen at the Galactic value.




Figure 6. PN (upper) and MOS (lower) spectra with the
best-fitting model, plotted against channel energy (keV).
The ratio of data to model values is shown in the lower
panel. A local background spectrum was used.



Galactic value, using a local background, and fitting in 
the 0.3 — 8 keV band. The temperature profile is shown 
in Fig. 7. The profile is consistent with isothermality, al- 
beit with large errors, and shows no sign of any central 
cool gas. 

The effect of the projection of the emission from the 
gas in the outer annuli was then modelled with an 'onion 
skin' method. The temperature structure was modelled 
as a series of spherical shells (each of which was isother- 
mal), and the spectra were fit from the outermost shell 
in. The spectrum of a shell was modelled with a single 
temperature MeKaL component, plus a MeKaL com- 
ponent for each external shell, whose temperature was 
fixed at the value measured in that shell, and whose 
normalisation was multiplied by a factor. These factors 
accounted for the volume of each external shell along 
the line of sight to the shell being fit, and the variation 
in density across each external shell using the measured 
gas density profile. This deprojection procedure had no 
significant effect on the form of the temperature pro- 
file, and did not reveal any central cool gas, although 
the size of the errors was increased, as one would
expect, as there were fewer photons available to constrain
the temperature of the free component in the interior
bins.
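
The geometric part of the 'onion skin' weighting can be illustrated with the standard expression for the volume of a spherical shell that lies within a projected annulus. This is a sketch of the volume factors only; the additional weighting by the gas density profile described above is not included, and the function name and bin edges are hypothetical.

```python
import math


def shell_annulus_volume(r_in, r_out, R_in, R_out):
    """Volume of the spherical shell r_in < r < r_out that projects
    into the annulus R_in < R < R_out on the sky (same length units)."""
    def term(r, R):
        d = r * r - R * R
        return d ** 1.5 if d > 0.0 else 0.0

    return (4.0 * math.pi / 3.0) * (
        term(r_out, R_in) - term(r_out, R_out)
        - term(r_in, R_in) + term(r_in, R_out)
    )


# Toy bin edges (arbitrary units); shells and annuli share the same edges.
edges = [0.0, 1.0, 2.0, 3.0]

# Contribution of the shell 1 < r < 2 to each projected annulus.
contrib = [shell_annulus_volume(1.0, 2.0, edges[i], edges[i + 1])
           for i in range(len(edges) - 1)]
shell_volume = 4.0 * math.pi / 3.0 * (2.0 ** 3 - 1.0 ** 3)
print(contrib, shell_volume)
```

Summing a shell's contributions over all annuli recovers its full volume, which is a convenient sanity check on the normalisation factors applied to the external-shell components.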



5.2 Entropy Profile 

The measurement of the gas entropy in groups and clus- 
ters of galaxies has provided evidence for some form 
of non-gravitational heating (e.g. Ponman et al. 1999; 
Lloyd-Davies et al. 2000; Ponman et al. 2003). In partic- 




Figure 7. Temperature profile of ClJ1226.9+3332 as a function
of radius (arcsec), based on spectra fit with abundance frozen
at 0.3 Z⊙, and a locally extracted background. Projected and
deprojected temperatures are plotted, with the deprojected
points offset by 2" for clarity. The solid line is the
best-fitting global temperature, with 1σ errors represented
by the dashed lines.



ular, if the entropy profiles of all systems are scaled by 
temperature, then cooler systems have a higher scaled 
entropy than hotter systems. This contrasts with the 
predictions of self-similar models, which include only 
gravitational heating, where all scaled-entropy profiles 
are identical. This indicates that non-gravitational heat- 
ing has an impact in cooler systems where it provides 
a significant fraction of the gas energy, while its ef- 
fect is not detectable in hotter systems. One would 
expect then, that an extremely hot system such as 
ClJ1226.9+3332 would have a similar entropy profile
to other hot systems, and our temperature profile of 
this system allowed a rare opportunity to measure an 
entropy profile at high redshift. 

For consistency with other work (Ponman et al. 
1999; Lloyd-Davies et al. 2000; Ponman et al. 2003), 
we defined a pseudo-entropy, 



$$S = \frac{T}{n_e^{2/3}}\ \mathrm{keV\,cm^{2}} \qquad (3)$$



It was then straightforward to produce the entropy pro- 
file shown in Fig. 8, using the gas density determined 
from the surface-brightness profile. The entropy was cal- 
culated assuming gas isothermality at 11.5 keV, and the 
data points show the entropies derived from the mea- 
sured temperatures in the projected temperature pro- 
file. 

It is interesting to note that the entropy observed
at 0.1 r₂₀₀ (≈ 300 ± 40 keV cm²) is significantly lower
than that found in local systems of similar temperature

Figure 8. Gas entropy of ClJ1226.9+3332 versus radius as
a fraction of r₂₀₀. The gas was assumed to be isothermal at
11.5 keV, and the data points give the entropies computed
from the measured deprojected temperatures shown in Fig.
7.

at this radius. For example, Ponman et al. (2003) find an
entropy of ≈ 550 ± 50 keV cm² in local systems above
10 keV. This lower entropy could be explained by an
underestimate of the temperature of ClJ1226.9+3332;
however, the temperature required to bring the entropy
in line with the local systems is ≈ 17 keV, which seems
unlikely. On the other hand, the central electron density
could be overestimated here. The measured value of n_e
was 0.0228 ± 0.001 cm⁻³, and a reduction of ≈ 50%
is required to bring the entropy at 0.1 r₂₀₀ in line with
local values. As discussed in Section 6, the value of r₂₀₀
used here is subject to systematic uncertainties due to
the assumptions made in extrapolating the mass profile.
However, these tend to lead to an overestimate of r₂₀₀,
giving an overestimate of the entropy at 0.1 r₂₀₀, so this
is unlikely to be the cause of the difference between the
entropy in local systems and that observed here.

An alternative explanation is that we are observing 
entropy evolution, driven by the increasing density of 
the universe with redshift. Assuming simple self-similar 
scaling, the mean density within a given overdensity ra- 
dius (relative to the critical density) is proportional to 
H(z)². The electron density then scales with redshift as

$$n_e(z) \propto E(z)^{2}, \qquad (4)$$

where

$$E(z) = (1+z)\left(1 + z\,\Omega_M + \frac{\Omega_\Lambda}{(1+z)^{2}} - \Omega_\Lambda\right)^{1/2}. \qquad (5)$$

Assuming that the redshift of observation is similar to
the redshift of formation (or at least, the redshift at
which the systems last virialised after a major merger),
entropy, when scaled by system temperature, should
therefore evolve as E(z)^{-4/3}. If the measured entropy
in ClJ1226.9+3332 at 0.1 r₂₀₀ is scaled by this factor (to
give 588 ± 78 keV cm²), it is consistent with the local
(Ponman et al. 2003) value. We note that if the dependence
of the density contrast Δ_c(z) on cosmology and
redshift (as described by Bryan & Norman (1998)) is
included in the redshift-scaling of the ClJ1226.9+3332
entropy, its value is slightly higher than, but still
consistent with, the local (Ponman et al. 2003) value. This



suggests that simple, self-similar arguments may explain 
ICM entropy evolution. Future papers will examine the 
evolution of entropy and other scaling relations using a 
sample of high-redshift clusters. 
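
The size of the self-similar correction can be checked with a short calculation. The sketch below assumes the standard expression for E(z) with Ω_M = 0.3 and Ω_Λ = 0.7 (algebraically equivalent to E(z)² = Ω_M(1+z)³ + (1 − Ω_M − Ω_Λ)(1+z)² + Ω_Λ) and uses the rounded entropy of 300 keV cm² quoted above; the function names are illustrative only.

```python
import math


def e_of_z(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble parameter E(z) = H(z)/H0."""
    return (1.0 + z) * math.sqrt(
        1.0 + z * omega_m + omega_l / (1.0 + z) ** 2 - omega_l
    )


def remove_entropy_evolution(s_obs, z, omega_m=0.3, omega_l=0.7):
    """Scale an observed entropy to z = 0, assuming S evolves as E(z)^(-4/3)."""
    return s_obs * e_of_z(z, omega_m, omega_l) ** (4.0 / 3.0)


z_cluster = 0.89
ez = e_of_z(z_cluster)
s_scaled = remove_entropy_evolution(300.0, z_cluster)
s_err_scaled = 40.0 * ez ** (4.0 / 3.0)
print(f"E(z) = {ez:.2f}, scaled entropy = {s_scaled:.0f} +/- {s_err_scaled:.0f} keV cm^2")
```

With these inputs E(z) is about 1.65 and the scaled entropy is close to 585 keV cm², within rounding of the 588 ± 78 keV cm² quoted in the text.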

5.3 Chandra spectral analysis 

A spectrum was also extracted from the archived Chandra
observation of ClJ1226.9+3332, within a 60" radius
circle, and with a background extracted from a large
concentric annular region of the S3 chip (excluding point
sources). The quantum efficiency (QE) degradation
suffered by Chandra since launch can cause significant
overestimates of cluster temperatures if not modelled
correctly (e.g. Maughan et al. 2003). To account for this,
the Chandra spectrum was fit with an absorbed MeKaL
model, including an extra ACISABS (footnote 2) absorption
component. The observation of ClJ1226.9+3332 was taken
376 days after launch. There are also uncertainties in the
cross-calibration of the quantum efficiency of the
front-illuminated (FI) and back-illuminated (BI) CCDs. It
was initially thought that the QE curves were overestimated
at low energies by ≈ 7% for the FI chips (footnote 3). A more
recent reanalysis of pre-flight data has shown that the
QE curves of the BI chips are underestimated by ~ 9%
(footnote 4). However, due to an additional (as yet
unreleased) correction that is required to the telescope
effective area, the current best advice for measuring an
accurate temperature using the back-illuminated S3 chip is
not to apply any additional QE correction. Accordingly, none
was applied, but we note that systematic uncertainties
at the 10% level may exist.

The fits were performed in the 0.6 — 8 keV band, 
with the column density frozen at the Galactic value, 
and the abundance at 0.3 Z⊙. The best-fitting model
temperature was 12.6 keV, in good agreement with
that measured by XMM-Newton. The best-fitting Chan- 
dra temperature was also found to be consistent when 
the spectrum was fit in the 1 — 8 keV band, where the 
effects of the quantum efficiency degradation are less 
severe. 

The unabsorbed flux measured by Chandra (0.5–2 keV)
was (3.6 ± 0.1) × 10⁻¹³ erg s⁻¹ cm⁻² (after
extrapolation to r₂₀₀), which is consistent with that measured
by XMM-Newton and ROSAT. This shows again that
point source contamination was not a problem in the
XMM-Newton data.



6 DETERMINATION OF GLOBAL 
PROPERTIES 

We have derived the luminosity, gas mass, total mass, 
and gas mass fraction within two different radii. The 
most reliable results are those obtained within the
extent of the data (r = 100", corresponding to an
overdensity Δ ≈ 1000). The easiest results to compare with
theoretical models of cluster growth are those
extrapolated by a factor of 2.2 in radius to r₂₀₀, but systematic
uncertainties may be associated with the extrapolation.

2 http://www.astro.psu.edu/users/chartas/xcontdir/xcont.html

3 http://cxc.harvard.edu/cal/Links/Acis/acis/Cal_prods/qe/12_01_00/

4 http://cxc.harvard.edu/cal/Links/Acis/acis/Cal_prods/qe/

The method used was similar (but not identical) to
that described in Maughan et al. (2003). Briefly, if the
gas density profile is described by a β-model, then under
the assumptions of isothermality, hydrostatic equilibrium,
and spherical symmetry, the total density profile
of the cluster is given by

$$\rho(<r) = \frac{M(<r)}{\frac{4}{3}\pi r^{3}} \qquad (6)$$

$$\rho(<r) = 2.70\times10^{13}\,\beta\left(\frac{T}{\mathrm{keV}}\right)\left(\frac{r}{\mathrm{Mpc}}\right)^{-2}\frac{(r/r_c)^{2}}{1+(r/r_c)^{2}}\ M_\odot\,\mathrm{Mpc^{-3}}. \qquad (7)$$



Here, we have adopted a value of 0.59 m_p for the
mean molecular weight of the gas, where m_p is the proton
mass. This density profile was used to estimate r₂₀₀,
and the measured flux was extrapolated out to this radius,
and converted to a luminosity. The central gas
density was computed from the measured MeKaL
normalisation, and the measured gas density profile was
integrated to give the gas mass. The total gravitating
mass within r₂₀₀ was derived from Eqn. 7.

The errors quoted on all non-observed quantities
were derived from 10,000 randomisations of the measured
quantities under the Gaussians described by their
measured 1σ errors. The properties of ClJ1226.9+3332
are summarised in Table 2. Our assumption of isothermality
is supported by the measured temperature profile,
and hardness-ratio mapping, while the relaxed appearance
of the X-ray emission, and the good fit of an
isothermal β-model to the data indicate that the gas
is close to hydrostatic equilibrium.

The extrapolation of the cluster properties out to
large radii introduces systematic uncertainties which are
not taken into account in the above method. In a sample
of 66 systems with measured temperature profiles
Sanderson et al. (2003) found that the incorrect assumption
of isothermality leads to an average overestimate of
M₂₀₀ by ≈ 30% and r₂₀₀ by ≈ 20%. The overestimation
of r₂₀₀ leads in turn to an overestimation of M_gas by
≈ 25% at that radius (the r₂₀₀ and M_gas uncertainties
were provided by Sanderson (private communication)).
These are taken as reasonable indications of the systematic
uncertainties on those properties, and are added in
quadrature to the statistical errors derived above in the
quoted values of these properties.

We find a virial radius of r₂₀₀ = 1.66 ± 0.34 Mpc for
cluster ClJ1226.9+3332. This means that the properties
of the system are directly measured out to ≈ 0.45 r₂₀₀.
Assuming an extrapolation of the surface-brightness
profile is valid, it is interesting to note that while this
radius encloses 90% of the X-ray emission, it encloses only
≈ 45% of the gas mass and total mass of the system.



7 VELOCITY DISPERSION

ClJ1226.9+3332 was observed by us on April 18, 2002
with the LRIS spectrograph (Oke et al. 1995) on the
Keck-I 10m telescope. We used the 600 l/mm grism
blazed at 1 μm, and a multi-object spectroscopy mask
with 1.25" wide slits. Further details of the observational
setup and the data reduction procedure will be
provided in a future paper (Ebeling et al. 2003). From 12
accurately measured cluster redshifts (individual radial
velocity error less than 30 km s⁻¹) and using a biweight
estimator for the systemic cluster redshift z and the
co-moving cluster velocity dispersion σ, we find z = 0.8920
and σ = 997^{+285}_{-205} km s⁻¹ using the ROSTAT statistics
package (Beers et al. 1990).

The observed velocity dispersion is consistent with
the measured X-ray temperature, given the scatter in
the local T–σ relation of Xue & Wu (2000). The velocity
histogram, although poorly constrained with only 12
velocities, shows no signs of significant substructure.
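
As a rough numerical check of Eqn. 7 and of the randomisation procedure described in Section 6, the overdensity radius and enclosed mass can be recomputed from the Table 2 values. This sketch rests on several assumptions that are not taken from the text: the critical density is 2.775 × 10¹¹ h² M⊙ Mpc⁻³ scaled by E(z)², the temperature error is symmetrised to ±1.0 keV, β is given an error of ±0.02, the core radius is held fixed at 118 kpc, and only statistical scatter is sampled. Note that Eqn. 7 simplifies to ρ(<r) ∝ 1/(r_c² + r²), so r_Δ has a closed form.

```python
import math
import random

RHO_CRIT_0 = 2.775e11  # critical density at z = 0 in h^2 Msun Mpc^-3


def e_squared(z, omega_m=0.3, omega_l=0.7):
    """E(z)^2 for the adopted cosmology."""
    return (omega_m * (1 + z) ** 3
            + (1 - omega_m - omega_l) * (1 + z) ** 2 + omega_l)


def overdensity_radius(delta, z, t_kev, beta, rc_mpc, h=0.7):
    """Radius (Mpc) where the mean density from Eqn. 7 equals delta * rho_crit(z)."""
    rho_target = delta * RHO_CRIT_0 * h ** 2 * e_squared(z)
    return math.sqrt(2.70e13 * beta * t_kev / rho_target - rc_mpc ** 2)


def enclosed_mass(delta, z, r_mpc, h=0.7):
    """Total mass (Msun) inside r_mpc at mean overdensity delta."""
    rho_target = delta * RHO_CRIT_0 * h ** 2 * e_squared(z)
    return 4.0 / 3.0 * math.pi * r_mpc ** 3 * rho_target


r200 = overdensity_radius(200, 0.89, 11.5, 0.66, 0.118)
m200 = enclosed_mass(200, 0.89, r200)

# 10,000 Gaussian randomisations of T and beta (statistical errors only).
rng = random.Random(1)
samples = []
for _ in range(10000):
    t = rng.gauss(11.5, 1.0)
    b = rng.gauss(0.66, 0.02)
    samples.append(enclosed_mass(200, 0.89,
                                 overdensity_radius(200, 0.89, t, b, 0.118)))
mean_m = sum(samples) / len(samples)
std_m = math.sqrt(sum((m - mean_m) ** 2 for m in samples) / (len(samples) - 1))
print(f"r200 = {r200:.2f} Mpc, M200 = {m200:.2e} Msun, stat. error = {std_m:.1e} Msun")
```

Under these assumptions the sketch gives r₂₀₀ ≈ 1.66 Mpc and M₂₀₀ ≈ 1.4 × 10¹⁵ M⊙, in line with Table 2; the statistical scatter alone is smaller than the quoted error, which also includes the systematic term added in quadrature.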



8 DISCUSSION 

ClJ1226.9+3332 is the highest temperature galaxy cluster
known at z > 0.6, and, uniquely at these redshifts,
is an extremely massive system (similar in mass to the
Coma cluster) which appears to be relaxed. Images of
both the XMM-Newton observation analysed here, and
the archived Chandra observation show almost circular
isophotes, and no obvious large-scale substructure.
Within the limits of the current data, the cluster is
generally isothermal (except for one small cooler region).
The relaxed nature is further supported by the good
agreement of the β-model with the surface brightness
distribution. This relaxed appearance is important in
justifying the assumptions used to derive the total mass.

The existence of even one high-redshift cluster of
this mass can be used to constrain cosmological models.
We initially test for consistency with the ΛCDM cosmology
of Spergel et al. (2003) from Wilkinson Microwave
Anisotropy Probe (WMAP) data, using their results
based on a model with a constant spectral index of
primordial fluctuations. In this cosmology, at a redshift of
0.89 we expect to see a density of systems more massive
than ClJ1226.9+3332 of 4.86 × 10⁻³ deg⁻² per unit
z. We have adopted the Jenkins et al. (2001) halo mass
function in this calculation, and converted between our
mass definition (M₂₀₀ relative to the critical density)
and that of Jenkins et al. (2001) (M₁₈₀ relative to the
background density) via M₁₈₀/M₂₀₀ = 1.14, assuming
an NFW (Navarro et al. 1996) profile with concentration
parameter c = 5. Given that ClJ1226.9+3332 was
detectable in the WARPS over the full survey area of
73 deg² and to a redshift of z = 1.8, and (very
conservatively) assuming no further evolution in the cluster
mass function beyond z = 0.89, we would expect a total
of 0.64 such clusters in the entire survey. If the cluster
mass within r₂₀₀ is ~ 30% lower, as estimated from the
combination of systematic and statistical errors, then
the predicted number of such clusters rises to 2.4. The
detection of one such cluster is therefore consistent with
this model.
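
The arithmetic behind the expected number can be reproduced as follows. The sketch assumes that the surface density quoted at z = 0.89 is applied unchanged over the full interval 0 < z < 1.8, which reproduces the quoted total of 0.64; this reading, and the Poisson probability of detecting at least one cluster, are illustrative additions rather than steps stated in the text.

```python
import math


def expected_clusters(density_per_deg2_per_z, area_deg2, delta_z):
    """Expected number of clusters for a constant surface density per unit z."""
    return density_per_deg2_per_z * area_deg2 * delta_z


def prob_at_least_one(n_expected):
    """Poisson probability of finding one or more clusters."""
    return 1.0 - math.exp(-n_expected)


n_lcdm = expected_clusters(4.86e-3, 73.0, 1.8)
p_lcdm = prob_at_least_one(n_lcdm)
p_low_mass = prob_at_least_one(2.4)  # expected number if the mass is ~30% lower
print(f"N = {n_lcdm:.2f}, P(>=1) = {p_lcdm:.2f}, P(>=1 | N=2.4) = {p_low_mass:.2f}")
```

For expected numbers well below unity the Poisson probability is close to the expected number itself, so the distinction matters mainly for the ΛCDM case.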

Property           Δ = 1000               Δ = 200

Redshift           0.892                  (independent of Δ)
T (keV)            11.5^{+1.1}_{-0.9}     (independent of Δ)
L_bol (erg s⁻¹)    (4.8 ± 0.1) × 10⁴⁵     (5.3 ± 0.2) × 10⁴⁵
σ (km s⁻¹)         997^{+285}_{-205}      (independent of Δ)
r_c (kpc)          118                    (independent of Δ)
β                  0.66 ± 0.02            (independent of Δ)
r_Δ (Mpc)          0.73 ± 0.04            1.66 ± 0.34
M_gas (M⊙)         (6.5 ± 0.4) × 10¹³     (1.7 ± 0.4) × 10¹⁴
M_tot (M⊙)         6.1 × 10¹⁴             (1.4 ± 0.5) × 10¹⁵
f_gas              0.11 ± 0.02            0.12 ± 0.05

Table 2. Summary of the measured and inferred properties of cluster ClJ1226.9+3332 based on XMM-Newton observations,
assuming a cosmology of Ω_M = 0.3 (Ω_Λ = 0.7) and H₀ = 70 km s⁻¹ Mpc⁻¹. The first column gives the properties within the
detection radius, corresponding to an overdensity of Δ = 1000. The second column gives the properties when extrapolated to an
overdensity radius of Δ = 200.



Interestingly, the predicted number reduces to 0.23 
in the running spectral index WMAP model in which 
the spectrum of primordial density fluctuations is a 
slowly changing power law as a function of scale and 
in which the third derivative of the inflation poten- 
tial plays a role (Peiris et al. 2003). This model was 
invoked (Spergel et al. 2003) primarily to investigate 
the apparent effects of combining other experimental 
CMB data with that of WMAP, in which the small
angular scale amplitude of fluctuations seems to be
systematically lower than the overall best-fit amplitude.
The existence of CLJ1226 therefore mildly disfavours 
the running index model. However, if the cluster mass 
is lower, but still within the measurement errors, then 
the predicted number of such clusters rises to 0.86, con- 
sistent with observation. The power of massive clusters 
at high redshift to discriminate between cosmologies is 
illustrated by this example, but a key requirement is ac- 
curate mass measurements from data extending to the 
virial radius. 

Although no longer a viable model, we note for
completeness that the probability of observing a cluster of
at least this mass in a high density (Ω_M = 1, H₀ = 50)
Universe is approximately 8 × 10⁻⁵, or ~ 1/13,000 (or
≈ 2 × 10⁻⁴ for a cluster mass at the low end of the
measurement errors).

In relaxed clusters, where the central gas cooling 
time is sufficiently low, gas may cool to a temperature 
of ~ 1/3 of that of the surrounding gas. The cooling 
time of the intra-cluster gas was estimated by divid- 
ing its thermal energy by its luminosity in a series of 
concentric spherical shells. The Chandra density pro- 
file was used for this because of its superior resolution, 
though the results from XMM-Newton were consistent. 
The radius within which the cooling time is less than the 
age of the universe at the cluster's redshift (6.22 Gyr) 
is 92 kpc (12") in our ΛCDM cosmology. There is no
significant central excess emission seen, and the interior 
bin of the temperature profile shows no evidence for any 
cooler gas. The weak residual counts from the 2D sur- 
face brightness fitting were used to estimate that any 
central cool gas contributes less than 5% of the clus- 
ter luminosity (assuming a 5 keV MeKaL spectrum for 
the cool gas). Numerical simulations have shown that 
merger events can disrupt central cooling in clusters 
(e.g. Ritchie & Thomas 2002). A plausible explanation, 
then, for any lack of central cool gas is that the sys- 
tem is being observed after some recent minor merger. 
While the gas appears to have relaxed into hydrostatic 
equilibrium on large scales, traces may remain in the 
cooler gas observed to the west of centre, which may be 
an in-falling poor cluster or group. 

The gas mass fraction of ClJ1226.9+3332 measured
within the spectral extraction radius of 100" was
0.11 ± 0.02, and 0.12 ± 0.05 when the mass profiles are
extrapolated out to the virial radius. These values are
consistent with those seen in local and intermediate-redshift
clusters (Vikhlinin et al. 1999; Sadat & Blanchard 2001; 
Allen et al. 2002; Ettori et al. 2003). Allen et al. (2002) 
and Ettori et al. (2003) also use the apparent variation 
in fgas with redshift to constrain cosmological param- 
eters. The measurement of fgas presented here, along 
with others at similar redshifts will allow this method 
to be extended in redshift. 

The metal abundance of Z = 0.33^{+0.14}_{-0.10} Z⊙ measured
in ClJ1226.9+3332 is well constrained for such a
high-redshift cluster, and is typical of values found in local
clusters. This measurement is consistent with the lack 
of evolution in Fe abundance and high redshift of en- 
richment (z > 1) of the ICM proposed by Mushotzky 
& Loewenstein (1997) and recently confirmed by Tozzi 
et al. (2003). 

Luminous clusters like ClJ1226.9+3332, with measured
luminosities and temperatures, provide useful tools
for calibrating the luminosity-temperature (L-T) relation
at high redshifts. The luminosities predicted by
two local L-T relations for a cluster with the temperature
of ClJ1226.9+3332 were compared with the measured
luminosity. With the L-T relation expressed as
L = A (T/6 keV)^B, Arnaud & Evrard (1999) (hereafter
AE99) find A = (2.88 ± 0.20) × 10⁴⁴ h₁₀₀⁻² erg s⁻¹
(h₁₀₀ = H₀/100 km s⁻¹ Mpc⁻¹) and B = 2.88 ± 0.15,
which predicts L = 3.8 × 10⁴⁵ erg s⁻¹. The L-T
relation of Markevitch (1998) (hereafter M98) (A =
(3.11 ± 0.27) × 10⁴⁴ h₁₀₀⁻² erg s⁻¹, B = 2.64 ± 0.27) predicts a
luminosity of L = 3.5 × 10⁴⁵ erg s⁻¹. The measured
luminosity of ClJ1226.9+3332 ((5.3 ± 0.2) × 10⁴⁵ erg s⁻¹)
is higher than the predicted values, but not significantly
so. The L-T relations above were derived for clusters
with weak or absent cooling flows (AE99), or with cooling
flow emission excluded (M98), so it should be reasonable
to compare them with this cluster. The normalisation
of the L-T relation (measured within a fixed
overdensity radius) is predicted to evolve with redshift,
by a factor E(z). The predicted luminosities, scaled
by E(z) in our ΛCDM cosmology (1.65), increase to
6.3 × 10⁴⁵ erg s⁻¹ (AE99), and 5.8 × 10⁴⁵ erg s⁻¹
(M98). These values agree well with the observed
luminosity, although as stated above, the measured
luminosity is also consistent with no evolution. Including the
redshift-dependence of the density contrast Δ_c(z) in the
predicted evolution does not affect this result.
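
The predicted luminosities follow directly from the power-law form of the L-T relation. The sketch below evaluates L = A (T/6 keV)^B for T = 11.5 keV and H₀ = 70 km s⁻¹ Mpc⁻¹, taking both normalisations to be in units of 10⁴⁴ h₁₀₀⁻² erg s⁻¹ (an assumption for M98 that is needed to recover the quoted prediction) and using central values only, so no errors are propagated. The function and variable names are illustrative.

```python
def lt_luminosity(t_kev, a_1e44, b, h100=0.7):
    """Bolometric luminosity (erg/s) from L = A (T / 6 keV)^B,
    with A given in units of 1e44 h100^-2 erg/s."""
    return a_1e44 * 1e44 / h100 ** 2 * (t_kev / 6.0) ** b


T_CLUSTER = 11.5   # keV
E_Z = 1.65         # E(z) at z = 0.89 in the adopted LCDM cosmology

l_ae99 = lt_luminosity(T_CLUSTER, 2.88, 2.88)
l_m98 = lt_luminosity(T_CLUSTER, 3.11, 2.64)

print(f"AE99: {l_ae99:.2e} erg/s, with evolution {E_Z * l_ae99:.2e} erg/s")
print(f"M98:  {l_m98:.2e} erg/s, with evolution {E_Z * l_m98:.2e} erg/s")
```

These central values reproduce the predictions of about 3.8 and 3.5 × 10⁴⁵ erg s⁻¹ without evolution, and about 6.3 and 5.8 × 10⁴⁵ erg s⁻¹ once the factor E(z) is applied.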

The same comparisons were made adopting a cosmology
of H₀ = 50 km s⁻¹ Mpc⁻¹ and Ω_M = 1
(Ω_Λ = 0). In this cosmology, the observed luminosity of
ClJ1226.9+3332 was (6.1 ± 0.2) × 10⁴⁵ erg s⁻¹, and E(z) =
2.594. The predicted luminosities of both L-T relations
agree well with the observed value without applying the
evolution factor. When evolution is included, the
predicted luminosities are 19.5 × 10⁴⁵ erg s⁻¹ (AE99),
and 18.7 × 10⁴⁵ erg s⁻¹ (M98). Thus, in this
cosmology the measured luminosity of ClJ1226.9+3332 is
inconsistent with the predicted evolution of the L-T
relation, at the ≈ 2σ level.



9 CONCLUSIONS 

ClJ1226.9+3332 is a remarkable and unique cluster. We
have performed a detailed analysis of an XMM-Newton
observation, and after careful comparison of background
subtraction methods, we have confirmed its high
temperature, and produced a temperature profile for the
first time at this high redshift (z = 0.89). The total
mass is found to be extremely high ((1.4 ± 0.5) × 10¹⁵ M⊙)
and similar to that of the Coma cluster. The probability
of such a cluster being found in the discovery survey is
0.64 (assuming a ΛCDM cosmology).

The relaxed, and generally isothermal, X-ray ap- 
pearance, together with the gas mass fraction, metal 
abundance, and gas density profile slope (β) all being
consistent with those of local clusters, suggests that this 
cluster was assembled significantly earlier than z=0.9. 

The high luminosity and relaxed nature make it an 
extremely useful subject for further studies of the gas, 
dark matter and galaxy properties out to large radii at 
high redshift. Deeper Chandra and XMM-Newton ob- 
servations are planned, in part to test the assumptions 
of isothermality and hydrostatic equilibrium which un- 
derpin the derivations of many of the cluster properties. 



10 ACKNOWLEDGEMENTS 

We thank Eric Perlman, Pasquale Mazzotta, and 
Monique Arnaud for discussions of this work, and Eliz- 
abeth Barrett for her work on the Keck spectroscopy. 
We thank Zoltan Haiman for his help with cosmological 
modelling. The referee made useful comments which im- 
proved this paper. BJM is supported by a PPARC post- 
graduate studentship. HE and CS gratefully acknowl- 
edge financial support from NASA grant NAG 5-10085. 



REFERENCES 

Allen S. W., Ettori S., Fabian A. C., 2001, MNRAS, 324, 877
Allen S. W., Schmidt R. W., Fabian A. C., 2002, MNRAS, 334, L11
Arnaud M., Evrard A. E., 1999, MNRAS, 305, 631
Arnaud M., Majerowicz S., Lumb D., Neumann D. M.,
  Aghanim N., Blanchard A., Boer M., Burke D. J.,
  Collins C. A., Giard M., Nevalainen J., Nichol R. C.,
  Romer A. K., Sadat R., 2002, A&A, 390, 27
Beers T. C., Flynn K., Gebhardt K., 1990, AJ, 100, 32
Bryan G. L., Norman M. L., 1998, ApJ, 495, 80
Cagnoni I., Elvis M., Kim D.-W., Mazzotta P., Huang
  J.-S., Celotti A., 2001, ApJ, 560, 86
Cavaliere A., Fusco-Femiano R., 1976, A&A, 49, L137
Dickey J. M., Lockman F. J., 1990, ARA&A, 28, 215
Ebeling H., Barrett E., Jones L. R., Maughan B. J.,
  Perlman E., Scharf C., Horner D., 2003, in prep.
Ebeling H., Jones L. R., Fairley B. W., Perlman E.,
  Scharf C., Horner D., 2001, ApJ, 548, L23
Ebeling H., Jones L. R., Perlman E., Scharf C., Horner
  D., Wegner G., Malkan M., Fairley B., Mullis C. R.,
  2000, ApJ, 534, 133
Ettori S., Tozzi P., Rosati P., 2003, A&A, 398, 879
Fabian A. C., 1994, ARA&A, 32, 277
Fabian A. C., Sanders J. S., Ettori S., Taylor G. B.,
  Allen S. W., Crawford C. S., Iwasawa K., Johnstone
  R. M., 2001, MNRAS, 321, L33
Jenkins A., Frenk C. S., White S. D. M., Colberg J. M.,
  Cole S., Evrard A. E., Couchman H. M. P., Yoshida
  N., 2001, MNRAS, 321, 372
Jones C., Forman W., 1999, ApJ, 511, 65
Jones L. R., Scharf C., Ebeling H., Perlman E., Wegner
  G., Malkan M., Horner D., 1998, ApJ, 495, 100
Joy M., LaRoque S., Grego L., Carlstrom J. E., Dawson
  K., Ebeling H., Holzapfel W. L., Nagai D., Reese
  E. D., 2001, ApJ, 551, L1
Lloyd-Davies E. J., Ponman T. J., Cannon D. B., 2000,
  MNRAS, 315, 689
Lumb D. H., Warwick R. S., Page M., De Luca A.,
  2002, A&A, 389, 93
Markevitch M., 1998, ApJ, 504, 27
Markevitch M., Vikhlinin A., Mazzotta P., 2001,
  astro-ph/0108520
Maughan B. J., Jones L. R., Ebeling H., Perlman E.,
  Rosati P., Frye C., Mullis C. R., 2003, ApJ, 587, 589
Mazzotta P., Markevitch M., Forman W. R., Jones C.,
  Vikhlinin A., VanSpeybroeck L., 2002, ApJ, submitted
Mushotzky R. F., Loewenstein M., 1997, ApJ, 481, L63
Navarro J. F., Frenk C. S., White S. D. M., 1996, ApJ,
  462, 563
Oke J. B., Cohen J. C., Carr M., Cromer J., Dingizian
  A., Harris F. H., Labrecque S., Lucinio R., Schaal W.,
  Epps H., Miller J., 1995, PASP, 107, 375
Peiris H. V., Komatsu E., Verde L., Spergel D. N.,
  Bennett C. L., Halpern M., Hinshaw G., Jarosik N.,
  Kogut A., Limon M., Meyer S. S., Page L., Tucker
  G. S., Wollack E., Wright E. L., 2003, ApJS, 148, 213
Perlman E. S., Horner D. J., Jones L. R., Scharf C. A.,
  Ebeling H., Wegner G., Malkan M., 2002, ApJS, 140,
  265
Ponman T. J., Cannon D. B., Navarro J. F., 1999,
  Nature, 397, 135
Ponman T. J., Sanderson A. J. R., Finoguenov A.,
  2003, MNRAS, 343, 331
Ritchie B. W., Thomas P. A., 2002, MNRAS, 329, 675
Sadat R., Blanchard A., 2001, A&A, 371, 19
Sanderson A. J. R., Ponman T. J., Finoguenov A.,
  Lloyd-Davies E. J., Markevitch M., 2003, MNRAS,
  340, 989
Scharf C., Jones L. R., Ebeling H., Perlman E., Malkan
  M., Wegner G., 1997, ApJ, 477, 79
Spergel D. N., Verde L., Peiris H. V., Komatsu E.,
  Nolta M. R., Bennett C. L., Halpern M., Hinshaw G.,
  Jarosik N., Kogut A., Limon M., Meyer S. S., Page
  L., Tucker G. S., Weiland J. L., Wollack E., Wright
  E. L., 2003, ApJS, 148, 175
Tozzi P., Rosati P., Ettori S., Borgani S., Mainieri V.,
  Norman C., 2003, ApJ, 593, 705
Vikhlinin A., Forman W., Jones C., 1999, ApJ, 525, 47
Xue Y., Wu X., 2000, ApJ, 538, 65



© 0000 RAS, MNRAS 000, 000-000 



